Unsupervised Morphological Analysis Using Tries
Abstract
This article presents an unsupervised morphological analysis algorithm that segments words into roots and affixes. The algorithm relies on word occurrence counts in a given dataset. The target languages are English, Finnish, and Turkish, but the algorithm can segment words from any language, given a word list with occurrence counts acquired from a corpus. In each iteration, the algorithm splits words according to their occurrence counts and constructs a new trie from the remaining affixes. Preliminary experimental results on the three languages show that the algorithm performs better than most previous algorithms.
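The abstract does not state the exact splitting rule, so the following is only a minimal sketch of the iterative idea under an assumed criterion: a string is cut after its longest proper prefix that is itself a listed string occurring at least as often as the string being analysed. The cut-off pieces are kept, and a new trie is built from the remaining affixes for the next iteration. All names (`TrieNode`, `build_trie`, `split_point`, `segment`, `max_iterations`) and the toy counts are hypothetical and are not taken from the article.

```python
from collections import Counter


class TrieNode:
    """Character trie node; `count` is the occurrence count of the string ending here."""
    __slots__ = ("children", "count")

    def __init__(self):
        self.children = {}
        self.count = 0  # 0 means no listed string ends at this node


def build_trie(counts):
    """Insert every string with its occurrence count into a fresh trie."""
    root = TrieNode()
    for string, count in counts.items():
        node = root
        for ch in string:
            node = node.children.setdefault(ch, TrieNode())
        node.count += count
    return root


def split_point(root, string, own_count):
    """ASSUMED criterion (not from the article): length of the longest proper
    prefix that is a listed string occurring at least as often as `string`.
    Returns 0 when no such prefix exists."""
    node, best = root, 0
    for i, ch in enumerate(string[:-1], start=1):
        node = node.children[ch]  # path exists because `string` is in the trie
        if node.count > 0 and node.count >= own_count:
            best = i
    return best


def segment(word_counts, max_iterations=3):
    """Iteratively split words; each pass builds a new trie from the remainders."""
    pieces = {w: [] for w in word_counts}  # morphs already split off
    tail = {w: w for w in word_counts}     # part of each word not yet analysed
    active = set(word_counts)
    for _ in range(max_iterations):
        counts = Counter()
        for w in active:
            counts[tail[w]] += word_counts[w]
        trie = build_trie(counts)
        split_words = set()
        for w in active:
            cut = split_point(trie, tail[w], counts[tail[w]])
            if cut:
                pieces[w].append(tail[w][:cut])
                tail[w] = tail[w][cut:]
                split_words.add(w)
        if not split_words:
            break
        active = split_words  # only remaining affixes enter the next trie
    return {w: pieces[w] + [tail[w]] for w in word_counts}


if __name__ == "__main__":
    toy = {"help": 30, "care": 25, "careful": 9, "helpfulness": 3,
           "walk": 50, "walks": 20, "walked": 15}
    for word, morphs in segment(toy).items():
        print(word, "->", " + ".join(morphs))
```

On this toy list the first pass splits off `help` from `helpfulness`, and the second pass, working on a trie of remainders only, splits `fulness` into `ful` and `ness` because `ful` was left over from `careful`. A real system would need a more careful criterion than this one, which over-segments whenever a short frequent word happens to be a prefix of an unrelated longer word.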
Similar resources
A Trie-Structured Bayesian Model for Unsupervised Morphological Segmentation
In this paper, we introduce a trie-structured Bayesian model for unsupervised morphological segmentation. We adopt prior information from different sources in the model. We use neural word embeddings to discover words that are morphologically derived from each other and thereby that are semantically similar. We use letter successor variety counts obtained from tries that are built by neural wor...
A Nonlinear Grayscale Morphological and Unsupervised Method for Human Facial Synthesis Based on an Example Image
Generating a human face from an example image is a requirement in biometric applications that identify individuals. In this paper, face generation consists of three main steps. In the first step, significant lines and edges of the example image are detected using nonlinear grayscale morphology. Then, hair areas are identified in the sample face. The fin...
Bootstrapping Morphological Analysis of Gĩkũyũ Using Unsupervised Maximum Entropy Learning
This paper describes a proof-of-the-principle experiment in which maximum entropy learning is used for the automatic induction of shallow morphological features for the resource-scarce Bantu language of Gĩkũyũ. This novel approach circumvents the limitations of typical unsupervised morphological induction methods that employ minimum-edit distance metrics to establish morphological similarity bet...
Using Morphological and Distributional Cues for Inductive Part-of-Speech Tagging
In this paper we evaluate the role of morphological and distributional cues in PoS induction, using an incremental and unsupervised learning algorithm with clustering on a vector space.
Publication date: 2011